AITopics | Kazan

Kazan

International underwater cable attacks by Russia, China are no 'mere coincidence' warns EU's top diplomat

FOX News | Jan-14-2025, 09:00:24 GMT

Attacks on underwater cables running through strategically significant bodies of water in both the Baltic Sea and the South China Sea by Russia and China, respectively, in recent months have top officials concerned they are not "mere coincidence." Maritime sabotage efforts in both regions of the world appear to have been on the rise over the last several years, with a notable spike in recent months after at least three separate attacks occurred in as many months, beginning in November, and the top suspects are Russia and China. "The Kremlin has been running a hybrid campaign against Europe for years, ranging from spreading disinformation and cyberattacks to weaponizing energy supplies. Since Russia's full-scale invasion of Ukraine, these efforts have intensified dramatically," EU High Representative Kaja Kallas told Fox News Digital. "However, Russia is not the only challenge we face."

cable, china, russia, (12 more...)

FOX News

Country:

Asia > Russia (1.00)
Atlantic Ocean > North Atlantic Ocean > Baltic Sea (0.30)
North America > United States (0.30)
(15 more...)

Industry:

Government > Military (1.00)
Government > Regional Government > Europe Government > Russia Government (0.50)
Government > Regional Government > Asia Government > Russia Government (0.50)

Technology:

Information Technology > Artificial Intelligence (0.49)
Information Technology > Security & Privacy (0.35)

Semantic Component Analysis: Discovering Patterns in Short Texts Beyond Topics

Eichin, Florian, Schuster, Carolin M., Groh, Georg, Hedderich, Michael A.

arXiv.org Artificial Intelligence | Dec-16-2024

Topic modeling is a key method in text analysis, but existing approaches are limited by assuming one topic per document or fail to scale efficiently for large, noisy datasets of short texts. We introduce Semantic Component Analysis (SCA), a novel topic modeling technique that overcomes these limitations by discovering multiple, nuanced semantic components beyond a single topic in short texts, which we accomplish by introducing a decomposition step to the clustering-based topic modeling framework. We evaluate SCA on Twitter datasets in English, Hausa and Chinese. It achieves competitive coherence and diversity compared to BERTopic, while uncovering at least double the semantic components and maintaining a noise rate close to zero. Furthermore, SCA is scalable and effective across languages, including an underrepresented one.

bertopic, dataset, semantic component, (13 more...)

arXiv.org Artificial Intelligence

2410.21054

Country:

Asia > Russia (0.28)
Asia > China (0.04)
North America > Canada (0.04)
(21 more...)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Leisure & Entertainment > Sports (0.93)
(3 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

HJ-Ky-0.1: an Evaluation Dataset for Kyrgyz Word Embeddings

Alekseev, Anton, Kabaeva, Gulnara

arXiv.org Artificial Intelligence | Nov-28-2024

One of the key tasks in modern applied computational linguistics is constructing word vector representations (word embeddings), which are widely used to address natural language processing tasks such as sentiment analysis, information extraction, and more. To choose an appropriate method for generating these word embeddings, quality assessment techniques are often necessary. A standard approach involves calculating distances between vectors for words with expert-assessed 'similarity'. This work introduces the first 'silver standard' dataset for such tasks in the Kyrgyz language, alongside training corresponding models and validating the dataset's suitability through quality evaluation metrics.
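The evaluation idea in the abstract above (comparing vector distances with expert-assessed similarity) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes cosine similarity as the vector measure and Spearman rank correlation as the agreement score, and the embeddings, word keys and human scores are hypothetical toy values.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy embeddings and human similarity judgements (not real data).
embeddings = {
    "w1": [1.0, 0.0],
    "w2": [0.9, 0.1],
    "w3": [0.0, 1.0],
    "w4": [0.5, 0.5],
}
judged_pairs = [("w1", "w2", 9.0), ("w1", "w4", 5.0), ("w1", "w3", 1.0)]

model_scores = [cosine(embeddings[a], embeddings[b]) for a, b, _ in judged_pairs]
human_scores = [score for _, _, score in judged_pairs]
rho = spearman(model_scores, human_scores)
print(f"Spearman correlation with human judgements: {rho:.3f}")
```

A higher correlation suggests that the embedding space orders word pairs in roughly the same way the annotators do; comparing this score across embedding models is one common way to choose between them.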

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.56634/16948335.2023.4.1723-1731

2411.10724

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Saxony > Leipzig (0.09)
Asia > Russia (0.05)
(7 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)

KyrgyzNLP: Challenges, Progress, and Future

Alekseev, Anton, Turatali, Timur

arXiv.org Artificial Intelligence | Nov-15-2024

Large language models (LLMs) have excelled in numerous benchmarks, advancing AI applications in both linguistic and non-linguistic tasks. However, this has primarily benefited well-resourced languages, leaving less-resourced ones (LRLs) at a disadvantage. In this paper, we highlight the current state of the NLP field in one specific LRL: Kyrgyz (kyrgyz tili). Human evaluation, including annotated datasets created by native speakers, remains an irreplaceable component of reliable NLP performance, especially for LRLs where automatic evaluations can fall short. In recent assessments of the resources for Turkic languages, Kyrgyz is labeled with the status 'Scraping By', a severely under-resourced language spoken by millions. This is concerning given the growing importance of the language, not only in Kyrgyzstan but also among diaspora communities where it holds no official status. We review prior efforts in the field, noting that many of the publicly available resources have only recently been developed, with few exceptions beyond dictionaries (the processed data used for the analysis is presented at https://kyrgyznlp.github.io/). While recent papers have made some headway, much more remains to be done. Despite interest and support from both business and government sectors in the Kyrgyz Republic, the situation for Kyrgyz language resources remains challenging. We stress the importance of community-driven efforts to build these resources, ensuring the sustainability of future advancement. We then share our view of the most pressing challenges in Kyrgyz NLP. Finally, we propose a roadmap for future development in terms of research topics and language resources.

kyrgyz nlp, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2411.05503

Country:

Asia > Russia (0.14)
Europe > Germany > Saxony > Leipzig (0.05)
Asia > Kyrgyzstan > Chüy Region > Bishkek (0.04)
(19 more...)

Genre:

Research Report (1.00)
Overview > Growing Problem (0.34)

Industry:

Government (1.00)
Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(4 more...)

Denoising ESG: quantifying data uncertainty from missing data with Machine Learning and prediction intervals

Caprioli, Sergio, Foschi, Jacopo, Crupi, Riccardo, Sabatino, Alessandro

arXiv.org Artificial Intelligence | Jul-29-2024

Environmental, Social, and Governance (ESG) datasets are frequently plagued by significant data gaps, leading to inconsistencies in ESG ratings due to varying imputation methods. This study addresses the missing data issues in ESG datasets using machine learning techniques, comparing K-Nearest Neighbors, Gradient Boosting, Multiple Imputation by Chained Equations (MICE) and Neural Networks. We focus on quantifying the risk induced by data anomalies and provide tools to assess the impacts of this risk on the variability of the scores. By introducing prediction uncertainty using methods such as Predictive Mean Matching and Local Residual Draw, in order to assign confidence measures to individual predictions, we provide a nuanced understanding of prediction uncertainty. Empirical analyses show that these methods improve imputation accuracy and quantify uncertainty, which is required for reliable ESG scoring in banking and finance.

counterparty, dataset, imputation, (16 more...)

arXiv.org Artificial Intelligence

2407.20047

Country:

Europe > Italy (0.04)
Europe > Russia > Volga Federal District > Republic of Tatarstan > Kazan (0.04)
Europe > Germany > Berlin (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.64)

Industry: Banking & Finance (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.55)

SumHiS: Extractive Summarization Exploiting Hidden Structure

Tikhonov, Pavel, Ianina, Anastasiya, Malykh, Valentin

arXiv.org Artificial Intelligence | Jun-12-2024

Extractive summarization is the task of highlighting the most important parts of a text. We introduce a new approach to the extractive summarization task that uses the hidden clustering structure of the text. Experimental results on CNN/DailyMail demonstrate that our approach generates more accurate summaries than both extractive and abstractive methods, achieving state-of-the-art results in terms of the ROUGE-2 metric and exceeding previous approaches by 10%. Additionally, we show that the hidden structure of the text can be interpreted as aspects.

representation, sumhis, summarization, (13 more...)

arXiv.org Artificial Intelligence

2406.08215

Country:

Asia > South Korea (0.14)
North America > United States (0.14)
Asia > Russia (0.05)
(6 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Quantum Circuit for Random Forest Prediction

Safina, Liliia, Khadieva, Kamil, Zinnatullina, Ilnar, Khadieva, Aliya

arXiv.org Artificial Intelligence | Dec-28-2023

In this work, we present a quantum circuit for a binary classification prediction algorithm using a random forest model. The quantum prediction algorithm was presented in our previous works. We construct a circuit and implement it using Qiskit tools (a Python module for quantum programming). One of our goals is to reduce the number of basic quantum gates (elementary gates). The set of basic quantum gates used in this work consists of single-qubit gates and a controlled NOT (CNOT) gate. The number of CNOT gates in our circuit is estimated as $O(2^{n+2h+1})$, whereas trivial circuit decomposition techniques give $O(4^{|X|+n+h+2})$ CNOT gates, where $n$ is the number of trees in a random forest model, $h$ is the tree height and $|X|$ is the length of attributes of an input object $X$. The prediction process returns an index of the corresponding class for the input $X$.
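To get a feel for the gap between the two CNOT estimates quoted in the abstract above, the sketch below evaluates both expressions for sample parameters. It is only an arithmetic illustration: the constants hidden by the big-O notation are ignored, and the function names and sample values of $n$, $h$ and $|X|$ are hypothetical choices, not figures from the paper.

```python
def proposed_cnot_scale(n, h):
    """Growth term of the circuit's estimate, 2^(n + 2h + 1)."""
    return 2 ** (n + 2 * h + 1)

def trivial_cnot_scale(x_len, n, h):
    """Growth term of the trivial decomposition, 4^(|X| + n + h + 2)."""
    return 4 ** (x_len + n + h + 2)

# Hypothetical sample sizes: 3 trees, tree height 2, 4 input attributes.
n, h, x_len = 3, 2, 4
proposed = proposed_cnot_scale(n, h)
trivial = trivial_cnot_scale(x_len, n, h)

print(f"proposed growth term: {proposed}")
print(f"trivial growth term:  {trivial}")
print(f"ratio (trivial / proposed): {trivial // proposed}")
```

Because the trivial estimate has base 4 and also depends on $|X|$, its growth term outpaces the proposed one as soon as any of the parameters grows, which is the motivation the abstract gives for reducing the gate count.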

algorithm, prediction algorithm, probability, (14 more...)

arXiv.org Artificial Intelligence

2312.16877

Country:

Europe > Russia > Volga Federal District > Republic of Tatarstan > Kazan (0.04)
Asia > Russia (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.86)

Dealing with Sparse Rewards Using Graph Neural Networks

Gerasyov, Matvey, Makarov, Ilya

arXiv.org Artificial Intelligence | Oct-15-2023

Reinforcement learning is a machine learning paradigm where an artificial agent learns the optimal behavior through interactions with a dynamic environment. Goals and purposes are explained to the agent via a scalar reward signal it receives after each interaction. Throughout the training process, the agent infers the behavior that maximizes cumulative reward, also called the return. To succeed in this task, the agent needs to explore the environment to understand which states and actions yield high rewards. On the other hand, the agent also has to exploit the rewards it has already received to adapt its behavior. This problem is known as the exploration and exploitation trade-off. This work was supported in part on Section 2 by the Strategic Project "Digital Business" within the framework of the Strategic Academic Leadership Program "Priority 2030" at the National University of Science and Technology (NUST) MISiS, in part by the Basic Research Program at the National Research University Higher School of Economics (HSE University), and in part by the Computational Resources of HPC Facilities at HSE University.
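The exploration and exploitation trade-off described in the abstract above can be illustrated with a small multi-armed bandit. The sketch below uses epsilon-greedy action selection, a standard textbook strategy chosen here purely for illustration; it is not the graph neural network method of the paper, and the arm probabilities, step count and `epsilon` value are hypothetical.

```python
import random

def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit.

    With probability epsilon the agent explores a random arm; otherwise it
    exploits the arm with the highest current reward estimate.
    Returns per-arm pull counts, reward estimates, and the total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        # Scalar reward signal: 1 with the arm's success probability, else 0.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return counts, estimates, total_reward

# Hypothetical success probabilities for three arms.
counts, estimates, total_reward = epsilon_greedy([0.1, 0.5, 0.9])
print("pulls per arm:", counts)
print("estimated means:", [round(e, 2) for e in estimates])
print("cumulative reward (return):", total_reward)
```

Setting `epsilon` to 0 makes the agent purely exploitative and liable to lock onto an early, possibly poor arm, while setting it to 1 makes it purely exploratory; intermediate values trade some immediate reward for better estimates.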

learning, reinforcement, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ACCESS.2023.3305927

2203.13424

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
Europe > Russia > Volga Federal District > Nizhny Novgorod Oblast > Nizhny Novgorod (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(7 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Unmasking Parkinson's Disease with Smile: An AI-enabled Screening Framework

Adnan, Tariq, Islam, Md Saiful, Rahman, Wasifur, Lee, Sangwu, Tithi, Sutapa Dey, Noshin, Kazi, Sarker, Imran, Rahman, M Saifur, Hoque, Ehsan

arXiv.org Artificial Intelligence | Aug-3-2023

Parkinson's disease (PD) diagnosis remains challenging due to lacking a reliable biomarker and limited access to clinical care. In this study, we present an analysis of the largest video dataset containing micro-expressions to screen for PD. We collected 3,871 videos from 1,059 unique participants, including 256 self-reported PD patients. The recordings are from diverse sources encompassing participants' homes across multiple countries, a clinic, and a PD care facility in the US. Leveraging facial landmarks and action units, we extracted features relevant to Hypomimia, a prominent symptom of PD characterized by reduced facial expressions. An ensemble of AI models trained on these features achieved an accuracy of 89.7% and an Area Under the Receiver Operating Characteristic (AUROC) of 89.3% while being free from detectable bias across population subgroups based on sex and ethnicity on held-out data. Further analysis reveals that features from the smiling videos alone lead to comparable performance, even on two external test sets the model has never seen during training, suggesting the potential for PD risk assessment from smiling selfie videos.

artificial intelligence, machine learning, participant, (19 more...)

arXiv.org Artificial Intelligence

2308.02588

Country:

Asia > Bangladesh (0.06)
North America > United States > Ohio (0.04)
North America > United States > Alaska (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

OntoMath${}^{\mathbf{PRO}}$ 2.0 Ontology: Updates of the Formal Model

Kirillovich, Alexander, Nevzorova, Olga, Lipachev, Evgeny

arXiv.org Artificial Intelligence | Mar-17-2023

This paper is devoted to the problems of ontology-based mathematical knowledge management and representation. The main attention is paid to the development of a formal model for the representation of mathematical statements in the Linked Open Data cloud. The proposed model is intended for applications that extract mathematical facts from natural language mathematical texts and represent these facts as Linked Open Data. The model is used in the development of a new version of the OntoMath${}^{\mathrm{PRO}}$ ontology of professional mathematics. OntoMath${}^{\mathrm{PRO}}$ underlies a semantic publishing platform that takes as input a collection of mathematical papers in LaTeX format and builds their ontology-based Linked Open Data representation. The semantic publishing platform, in turn, is a central component of the OntoMath digital ecosystem, an ecosystem of ontologies, text analytics tools, and applications for mathematical knowledge management, including semantic search for mathematical formulas and a recommender system for mathematical papers. According to the new model, the ontology is organized into three layers: a foundational ontology layer, a domain ontology layer and a linguistic layer. The domain ontology layer contains language-independent math concepts. The linguistic layer provides linguistic grounding for these concepts, and the foundational ontology layer provides them with meta-ontological annotations. The concepts are organized in two main hierarchies: the hierarchy of objects and the hierarchy of reified relationships.

artificial intelligence, ontology, ontomath, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1134/S1995080222150136

2303.13542

Country:

Asia > Russia (0.05)
Europe > Russia > Volga Federal District > Republic of Tatarstan > Kazan (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
